Overview

Dataset statistics

Number of variables: 16
Number of observations: 876199
Missing cells: 5549308
Missing cells (%): 39.6%
Duplicate rows: 27528
Duplicate rows (%): 3.1%
Total size in memory: 107.0 MiB
Average record size in memory: 128.0 B
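
The percentages in this block follow from the raw counts. The sketch below reproduces them with plain arithmetic and also checks that the per-column missing counts listed under Alerts add up to the reported number of missing cells. The variable names and the `percent` helper are illustrative only; they are not part of the profiler.

```python
# Figures copied from the Overview and Alerts sections of this report.
n_variables = 16
n_rows = 876_199
missing_cells = 5_549_308
duplicate_rows = 27_528

# Per-column missing counts as listed under Alerts.
missing_by_column = {
    "item": 187_615,
    "color": 152_570,
    "use": 629_267,
    "material": 101_411,
    "print_pattern": 462_777,
    "characteristic": 319_976,
    "detail": 270_666,
    "age": 872_423,
    "weight": 832_911,
    "season": 875_313,
    "sensibility": 844_379,
}


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_rows              # every column has one cell per row
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_rows)
alerts_total = sum(missing_by_column.values())

print(missing_pct, duplicate_pct, alerts_total == missing_cells)
```

Because the eleven alert counts account for every missing cell, the remaining five columns should contain no missing values.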

Variable types

Categorical: 5
Text: 9
Numeric: 2

Alerts

category_large_desc has constant value "" [Constant]
Dataset has 27528 (3.1%) duplicate rows [Duplicates]
weight is highly imbalanced (85.4%) [Imbalance]
season is highly imbalanced (52.4%) [Imbalance]
item has 187615 (21.4%) missing values [Missing]
color has 152570 (17.4%) missing values [Missing]
use has 629267 (71.8%) missing values [Missing]
material has 101411 (11.6%) missing values [Missing]
print_pattern has 462777 (52.8%) missing values [Missing]
characteristic has 319976 (36.5%) missing values [Missing]
detail has 270666 (30.9%) missing values [Missing]
age has 872423 (99.6%) missing values [Missing]
weight has 832911 (95.1%) missing values [Missing]
season has 875313 (99.9%) missing values [Missing]
sensibility has 844379 (96.4%) missing values [Missing]
sale_price is highly skewed (γ1 = 628.0384855) [Skewed]
recent_sale_count is highly skewed (γ1 = 159.6980281) [Skewed]
recent_sale_count has 854339 (97.5%) zeros [Zeros]
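
The skewness alerts report γ1, the standardized third moment of a numeric column. As a minimal sketch, the function below computes the population form, g1 = m3 / m2^1.5. The profiler may apply a bias-corrected sample estimator, so this should not be read as its exact formula. The `skewness` helper and the toy data are hypothetical; the toy column only mimics the shape of a count that is almost always zero with a few large values.

```python
def skewness(values):
    """Population moment coefficient of skewness: g1 = m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy column: 97% zeros plus a short right tail.
toy_counts = [0] * 97 + [1, 2, 50]
symmetric = [-2, -1, 0, 1, 2]

zero_share = sum(1 for x in toy_counts if x == 0) / len(toy_counts)

print(skewness(symmetric))    # symmetric data has no skew
print(skewness(toy_counts))   # strongly positive: long right tail
print(zero_share)
```

A single extreme value among mostly-zero rows is enough to push γ1 far above 1, which is consistent with the Zeros and Skewed alerts appearing together for recent_sale_count.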

Reproduction

Analysis started: 2023-12-03 13:39:57.027129
Analysis finished: 2023-12-03 13:40:33.656094
Duration: 36.63 seconds
Software version: ydata-profiling v4.6.2
Download configuration: config.json
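
The reported duration is the difference between the two timestamps. A short check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block.
started = datetime.strptime("2023-12-03 13:39:57.027129", FMT)
finished = datetime.strptime("2023-12-03 13:40:33.656094", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```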

Variables

category_large_desc
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 MiB
여성복: 876199

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2628597
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
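
As a rough illustration of the character properties tallied in these sections, Python's standard `unicodedata` module exposes the general category of each code point. This is a sketch, not the profiler's own implementation: the helper name `category_counts` is hypothetical, and the sample strings are values that appear elsewhere in this report. Script and block properties are not available from `unicodedata` and are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Category codes: "Lo" = Other Letter, "Po" = Other Punctuation,
# "Nd" = Decimal Number, "Lu" = Uppercase Letter.
for sample in ("여성복", "블라우스/셔츠", "55사이즈"):
    print(sample, dict(category_counts(sample)))
```

Hangul syllables fall under Other Letter, which is why that category dominates every text column in this report.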

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 여성복
2nd row: 여성복
3rd row: 여성복
4th row: 여성복
5th row: 여성복

Common Values

Value | Count | Frequency (%)
여성복 | 876199 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
여성복 876199
100.0%

Most occurring characters

Value | Count | Frequency (%)
여 | 876199 | 33.3%
성 | 876199 | 33.3%
복 | 876199 | 33.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2628597
100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
여 | 876199 | 33.3%
성 | 876199 | 33.3%
복 | 876199 | 33.3%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2628597
100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
여 | 876199 | 33.3%
성 | 876199 | 33.3%
복 | 876199 | 33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2628597
100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
여 | 876199 | 33.3%
성 | 876199 | 33.3%
복 | 876199 | 33.3%

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 MiB
상의: 382391
아우터: 164949
원피스: 139365
팬츠: 120049
스커트: 69445

Length

Max length: 3
Median length: 2
Mean length: 2.4265686
Min length: 2

Characters and Unicode

Total characters: 2126157
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 원피스
2nd row: 원피스
3rd row: 원피스
4th row: 원피스
5th row: 원피스

Common Values

Value | Count | Frequency (%)
상의 | 382391 | 43.6%
아우터 | 164949 | 18.8%
원피스 | 139365 | 15.9%
팬츠 | 120049 | 13.7%
스커트 | 69445 | 7.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
상의 382391
43.6%
아우터 164949
18.8%
원피스 139365
 
15.9%
팬츠 120049
 
13.7%
스커트 69445
 
7.9%

Most occurring characters

ValueCountFrequency (%)
382391
18.0%
382391
18.0%
208810
9.8%
164949
7.8%
164949
7.8%
164949
7.8%
139365
 
6.6%
139365
 
6.6%
120049
 
5.6%
120049
 
5.6%
Other values (2) 138890
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2126157
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
382391
18.0%
382391
18.0%
208810
9.8%
164949
7.8%
164949
7.8%
164949
7.8%
139365
 
6.6%
139365
 
6.6%
120049
 
5.6%
120049
 
5.6%
Other values (2) 138890
 
6.5%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2126157
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
382391
18.0%
382391
18.0%
208810
9.8%
164949
7.8%
164949
7.8%
164949
7.8%
139365
 
6.6%
139365
 
6.6%
120049
 
5.6%
120049
 
5.6%
Other values (2) 138890
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2126157
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
382391
18.0%
382391
18.0%
208810
9.8%
164949
7.8%
164949
7.8%
164949
7.8%
139365
 
6.6%
139365
 
6.6%
120049
 
5.6%
120049
 
5.6%
Other values (2) 138890
 
6.5%

Distinct: 34
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 MiB
티셔츠: 165902
롱원피스: 104055
블라우스/셔츠: 102167
니트: 95974
자켓: 66596
Other values (29): 341505

Length

Max length: 8
Median length: 7
Mean length: 3.7408328
Min length: 2

Characters and Unicode

Total characters: 3277714
Distinct characters: 58
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 롱원피스
2nd row: 롱원피스
3rd row: 롱원피스
4th row: 롱원피스
5th row: 롱원피스

Common Values

Value | Count | Frequency (%)
티셔츠 | 165902 | 18.9%
롱원피스 | 104055 | 11.9%
블라우스/셔츠 | 102167 | 11.7%
니트 | 95974 | 11.0%
자켓 | 66596 | 7.6%
카디건 | 52966 | 6.0%
데님팬츠 | 48218 | 5.5%
면바지 | 43523 | 5.0%
롱스커트 | 40602 | 4.6%
점퍼 | 28776 | 3.3%
Other values (24) | 127420 | 14.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
티셔츠 165902
18.9%
롱원피스 104055
11.9%
블라우스/셔츠 102167
11.7%
니트 95974
11.0%
자켓 66596
7.6%
카디건 52966
 
6.0%
데님팬츠 48218
 
5.5%
면바지 43523
 
5.0%
롱스커트 40602
 
4.6%
점퍼 28776
 
3.3%
Other values (24) 127420
14.5%

Most occurring characters

ValueCountFrequency (%)
348990
 
10.6%
327939
 
10.0%
268401
 
8.2%
209922
 
6.4%
185044
 
5.6%
144657
 
4.4%
132111
 
4.0%
132111
 
4.0%
/ 120515
 
3.7%
119451
 
3.6%
Other values (48) 1288573
39.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3147127
96.0%
Other Punctuation 120515
 
3.7%
Uppercase Letter 10072
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
348990
 
11.1%
327939
 
10.4%
268401
 
8.5%
209922
 
6.7%
185044
 
5.9%
144657
 
4.6%
132111
 
4.2%
132111
 
4.2%
119451
 
3.8%
112239
 
3.6%
Other values (46) 1166262
37.1%
Other Punctuation
ValueCountFrequency (%)
/ 120515
100.0%
Uppercase Letter
ValueCountFrequency (%)
H 10072
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3147127
96.0%
Common 120515
 
3.7%
Latin 10072
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
348990
 
11.1%
327939
 
10.4%
268401
 
8.5%
209922
 
6.7%
185044
 
5.9%
144657
 
4.6%
132111
 
4.2%
132111
 
4.2%
119451
 
3.8%
112239
 
3.6%
Other values (46) 1166262
37.1%
Common
ValueCountFrequency (%)
/ 120515
100.0%
Latin
ValueCountFrequency (%)
H 10072
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3147127
96.0%
ASCII 130587
 
4.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
348990
 
11.1%
327939
 
10.4%
268401
 
8.5%
209922
 
6.7%
185044
 
5.9%
144657
 
4.6%
132111
 
4.2%
132111
 
4.2%
119451
 
3.8%
112239
 
3.6%
Other values (46) 1166262
37.1%
ASCII
ValueCountFrequency (%)
/ 120515
92.3%
H 10072
 
7.7%

item
Text

MISSING 

Distinct: 391666
Distinct (%): 56.9%
Missing: 187615
Missing (%): 21.4%
Memory size: 6.7 MiB

Length

Max length: 114
Median length: 84
Mean length: 24.748719
Min length: 2

Characters and Unicode

Total characters: 17041572
Distinct characters: 397
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 330125
Unique (%): 47.9%
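
The Distinct (%), Unique (%) and Mean length figures for item are all consistent with a denominator of non-missing rows rather than all rows. The sketch below reproduces them from the counts in this block; the variable names are illustrative.

```python
# Figures copied from the Overview and from the item block.
n_rows = 876_199           # Number of observations
missing = 187_615          # item: Missing
distinct = 391_666         # item: Distinct
unique = 330_125           # item: Unique (values that occur exactly once)
total_chars = 17_041_572   # item: Total characters

present = n_rows - missing   # rows that actually carry a value

distinct_pct = round(100 * distinct / present, 1)
unique_pct = round(100 * unique / present, 1)
mean_length = total_chars / present

print(present, distinct_pct, unique_pct, round(mean_length, 6))
```

Dividing by all 876199 rows instead would give about 44.7% distinct, which does not match the report.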

Sample

1st row: 플리츠원피스/루즈핏원피스/여름원피스/스트링원피스/반팔원피스/롱원피스
2nd row: 롱셔츠/셔츠원피스/배색원피스/플리츠원피스/롱셔츠원피스
3rd row: 프릴원피스/캉캉원피스/러플원피스/레이스원피스
4th row: 여름원피스/러플원피스/나들이원피스/프릴원피스/스트라이프원피스
5th row: 데일리원피스/스퀘어넥원피스

Value | Count | Frequency (%)
캐주얼티셔츠 | 4078 | 0.6%
데일리티셔츠 | 2000 | 0.3%
맨투맨 | 1997 | 0.3%
데일리니트 | 1911 | 0.3%
베이식후드집업 | 1819 | 0.3%
캐주얼니트 | 1581 | 0.2%
캐주얼원피스 | 1549 | 0.2%
하객룩원피스 | 1386 | 0.2%
데일리원피스 | 1347 | 0.2%
데일리스커트 | 1306 | 0.2%
Other values (391656) | 669610 | 97.2%

Most occurring characters

ValueCountFrequency (%)
/ 2143530
 
12.6%
1195153
 
7.0%
852435
 
5.0%
791007
 
4.6%
534614
 
3.1%
500814
 
2.9%
445195
 
2.6%
404099
 
2.4%
401966
 
2.4%
396103
 
2.3%
Other values (387) 9376656
55.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 14845751
87.1%
Other Punctuation 2143530
 
12.6%
Decimal Number 26345
 
0.2%
Uppercase Letter 25946
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1195153
 
8.1%
852435
 
5.7%
791007
 
5.3%
534614
 
3.6%
500814
 
3.4%
445195
 
3.0%
404099
 
2.7%
401966
 
2.7%
396103
 
2.7%
340689
 
2.3%
Other values (374) 8983676
60.5%
Decimal Number
ValueCountFrequency (%)
7 10382
39.4%
5 4741
18.0%
9 2797
 
10.6%
8 2559
 
9.7%
4 2540
 
9.6%
6 1265
 
4.8%
3 1239
 
4.7%
0 414
 
1.6%
1 408
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
A 14149
54.5%
H 8602
33.2%
U 3195
 
12.3%
Other Punctuation
ValueCountFrequency (%)
/ 2143530
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 14845751
87.1%
Common 2169875
 
12.7%
Latin 25946
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1195153
 
8.1%
852435
 
5.7%
791007
 
5.3%
534614
 
3.6%
500814
 
3.4%
445195
 
3.0%
404099
 
2.7%
401966
 
2.7%
396103
 
2.7%
340689
 
2.3%
Other values (374) 8983676
60.5%
Common
ValueCountFrequency (%)
/ 2143530
98.8%
7 10382
 
0.5%
5 4741
 
0.2%
9 2797
 
0.1%
8 2559
 
0.1%
4 2540
 
0.1%
6 1265
 
0.1%
3 1239
 
0.1%
0 414
 
< 0.1%
1 408
 
< 0.1%
Latin
ValueCountFrequency (%)
A 14149
54.5%
H 8602
33.2%
U 3195
 
12.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 14845751
87.1%
ASCII 2195821
 
12.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 2143530
97.6%
A 14149
 
0.6%
7 10382
 
0.5%
H 8602
 
0.4%
5 4741
 
0.2%
U 3195
 
0.1%
9 2797
 
0.1%
8 2559
 
0.1%
4 2540
 
0.1%
6 1265
 
0.1%
Other values (3) 2061
 
0.1%
Hangul
ValueCountFrequency (%)
1195153
 
8.1%
852435
 
5.7%
791007
 
5.3%
534614
 
3.6%
500814
 
3.4%
445195
 
3.0%
404099
 
2.7%
401966
 
2.7%
396103
 
2.7%
340689
 
2.3%
Other values (374) 8983676
60.5%

color
Text

MISSING 

Distinct: 26546
Distinct (%): 3.7%
Missing: 152570
Missing (%): 17.4%
Memory size: 6.7 MiB

Length

Max length: 67
Median length: 60
Mean length: 9.6004734
Min length: 2
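
For any set of lengths, at least half of the values are at or above the median and all of them are at or above the minimum, so the mean cannot fall below the midpoint of the minimum and the median. The sketch below applies that bound to the figures in this block. The helper name `mean_lower_bound` and the toy list are hypothetical.

```python
import statistics


def mean_lower_bound(min_value, median_value):
    """Smallest mean compatible with a given minimum and median."""
    return (min_value + median_value) / 2


# Length statistics reported for color.
reported = {"min": 2, "median": 60, "mean": 9.6004734}

bound = mean_lower_bound(reported["min"], reported["median"])
consistent = reported["mean"] >= bound
print(bound, consistent)

# Sanity check of the bound itself on an arbitrary toy list of lengths.
toy_lengths = [2, 2, 3, 5, 9, 40]
toy_bound = mean_lower_bound(min(toy_lengths), statistics.median(toy_lengths))
print(statistics.mean(toy_lengths) >= toy_bound)
```

Here the bound is 31 while the reported mean is about 9.6, so the Median length figure probably does not describe per-row string lengths in the same way as the mean and should be read with caution.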

Characters and Unicode

Total characters: 6947181
Distinct characters: 43
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13058
Unique (%): 1.8%

Sample

1st row: 네온/베이지
2nd row: 베이지/네이비/블랙/블루/그린
3rd row: 블랙/그레이
4th row: 아이보리/브라운/베이지/네이비/그레이/블랙
5th row: 블랙/그레이

Value | Count | Frequency (%)
아이보리/블랙 | 27921 | 3.9%
블랙 | 27423 | 3.8%
블랙/베이지 | 26831 | 3.7%
블랙/화이트 | 22243 | 3.1%
라이트블루 | 13683 | 1.9%
블루 | 13082 | 1.8%
아이보리 | 11273 | 1.6%
블랙/그레이 | 10987 | 1.5%
화이트 | 10918 | 1.5%
베이지 | 10813 | 1.5%
Other values (26536) | 548455 | 75.8%

Most occurring characters

ValueCountFrequency (%)
/ 1311920
18.9%
965991
13.9%
585057
 
8.4%
405467
 
5.8%
323194
 
4.7%
259361
 
3.7%
259361
 
3.7%
259361
 
3.7%
249487
 
3.6%
228349
 
3.3%
Other values (33) 2099633
30.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 5635261
81.1%
Other Punctuation 1311920
 
18.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
965991
17.1%
585057
 
10.4%
405467
 
7.2%
323194
 
5.7%
259361
 
4.6%
259361
 
4.6%
259361
 
4.6%
249487
 
4.4%
228349
 
4.1%
186720
 
3.3%
Other values (32) 1912913
33.9%
Other Punctuation
ValueCountFrequency (%)
/ 1311920
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 5635261
81.1%
Common 1311920
 
18.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
965991
17.1%
585057
 
10.4%
405467
 
7.2%
323194
 
5.7%
259361
 
4.6%
259361
 
4.6%
259361
 
4.6%
249487
 
4.4%
228349
 
4.1%
186720
 
3.3%
Other values (32) 1912913
33.9%
Common
ValueCountFrequency (%)
/ 1311920
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 5635261
81.1%
ASCII 1311920
 
18.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 1311920
100.0%
Hangul
ValueCountFrequency (%)
965991
17.1%
585057
 
10.4%
405467
 
7.2%
323194
 
5.7%
259361
 
4.6%
259361
 
4.6%
259361
 
4.6%
249487
 
4.4%
228349
 
4.1%
186720
 
3.3%
Other values (32) 1912913
33.9%

use
Text

MISSING 

Distinct: 10874
Distinct (%): 4.4%
Missing: 629267
Missing (%): 71.8%
Memory size: 6.7 MiB

Length

Max length: 84
Median length: 81
Mean length: 7.848432
Min length: 2

Characters and Unicode

Total characters: 1938029
Distinct characters: 313
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6342
Unique (%): 2.6%

Sample

1st row: 데일리룩
2nd row: 55사이즈
3rd row: 55사이즈
4th row: 55사이즈/날씬해보이는
5th row: 55사이즈/날씬해보이는

Value | Count | Frequency (%)
55사이즈 | 36618 | 14.8%
데일리룩 | 29911 | 12.1%
체형커버 | 16837 | 6.8%
아웃핏 | 9059 | 3.7%
나들이룩/나들이/데일리룩 | 8478 | 3.4%
데이트룩/데일리룩 | 5520 | 2.2%
코디룩 | 4869 | 2.0%
오피스룩 | 4838 | 2.0%
날씬해보이는 | 4762 | 1.9%
커플룩/커플 | 3862 | 1.6%
Other values (10864) | 122178 | 49.5%

Most occurring characters

ValueCountFrequency (%)
287093
 
14.8%
/ 206346
 
10.6%
134796
 
7.0%
127217
 
6.6%
5 102450
 
5.3%
87448
 
4.5%
86943
 
4.5%
51343
 
2.6%
51341
 
2.6%
48784
 
2.5%
Other values (303) 754268
38.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1627715
84.0%
Other Punctuation 206346
 
10.6%
Decimal Number 102460
 
5.3%
Uppercase Letter 1508
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
287093
17.6%
134796
 
8.3%
127217
 
7.8%
87448
 
5.4%
86943
 
5.3%
51343
 
3.2%
51341
 
3.2%
48784
 
3.0%
48274
 
3.0%
42915
 
2.6%
Other values (291) 661561
40.6%
Uppercase Letter
ValueCountFrequency (%)
O 698
46.3%
T 349
23.1%
D 349
23.1%
X 55
 
3.6%
S 55
 
3.6%
H 2
 
0.1%
Decimal Number
ValueCountFrequency (%)
5 102450
> 99.9%
0 4
 
< 0.1%
2 3
 
< 0.1%
4 2
 
< 0.1%
3 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/ 206346
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1627715
84.0%
Common 308806
 
15.9%
Latin 1508
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
287093
17.6%
134796
 
8.3%
127217
 
7.8%
87448
 
5.4%
86943
 
5.3%
51343
 
3.2%
51341
 
3.2%
48784
 
3.0%
48274
 
3.0%
42915
 
2.6%
Other values (291) 661561
40.6%
Common
ValueCountFrequency (%)
/ 206346
66.8%
5 102450
33.2%
0 4
 
< 0.1%
2 3
 
< 0.1%
4 2
 
< 0.1%
3 1
 
< 0.1%
Latin
ValueCountFrequency (%)
O 698
46.3%
T 349
23.1%
D 349
23.1%
X 55
 
3.6%
S 55
 
3.6%
H 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1627715
84.0%
ASCII 310314
 
16.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
287093
17.6%
134796
 
8.3%
127217
 
7.8%
87448
 
5.4%
86943
 
5.3%
51343
 
3.2%
51341
 
3.2%
48784
 
3.0%
48274
 
3.0%
42915
 
2.6%
Other values (291) 661561
40.6%
ASCII
ValueCountFrequency (%)
/ 206346
66.5%
5 102450
33.0%
O 698
 
0.2%
T 349
 
0.1%
D 349
 
0.1%
X 55
 
< 0.1%
S 55
 
< 0.1%
0 4
 
< 0.1%
2 3
 
< 0.1%
H 2
 
< 0.1%
Other values (2) 3
 
< 0.1%

material
Text

MISSING 

Distinct: 1541
Distinct (%): 0.2%
Missing: 101411
Missing (%): 11.6%
Memory size: 6.7 MiB

Length

Max length: 30
Median length: 29
Mean length: 5.6378648
Min length: 1

Characters and Unicode

Total characters: 4368150
Distinct characters: 51
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 518
Unique (%): 0.1%

Sample

1st row: 폴리에스터/코튼
2nd row: 폴리에스터/나일론
3rd row: 실크/스판/레이온
4th row: 폴리에스터
5th row: 코튼

Value | Count | Frequency (%)
코튼 | 194045 | 25.0%
폴리에스터 | 138492 | 17.9%
폴리에스터/코튼 | 55796 | 7.2%
아크릴 | 39528 | 5.1%
폴리에스터/스판 | 23857 | 3.1%
폴리에스터/아크릴 | 23724 | 3.1%
폴리에스터/실크/스판/레이온 | 17770 | 2.3%
스판/코튼 | 15683 | 2.0%
(value missing) | 14932 | 1.9%
리넨 | 13340 | 1.7%
Other values (1531) | 237621 | 30.7%

Most occurring characters

ValueCountFrequency (%)
/ 517146
11.8%
460581
10.5%
396328
9.1%
362500
 
8.3%
362500
 
8.3%
362500
 
8.3%
344418
 
7.9%
340183
 
7.8%
177650
 
4.1%
106770
 
2.4%
Other values (41) 937574
21.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3851004
88.2%
Other Punctuation 517146
 
11.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
460581
12.0%
396328
10.3%
362500
9.4%
362500
9.4%
362500
9.4%
344418
8.9%
340183
8.8%
177650
 
4.6%
106770
 
2.8%
106770
 
2.8%
Other values (40) 830804
21.6%
Other Punctuation
ValueCountFrequency (%)
/ 517146
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3851004
88.2%
Common 517146
 
11.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
460581
12.0%
396328
10.3%
362500
9.4%
362500
9.4%
362500
9.4%
344418
8.9%
340183
8.8%
177650
 
4.6%
106770
 
2.8%
106770
 
2.8%
Other values (40) 830804
21.6%
Common
ValueCountFrequency (%)
/ 517146
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3851004
88.2%
ASCII 517146
 
11.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 517146
100.0%
Hangul
ValueCountFrequency (%)
460581
12.0%
396328
10.3%
362500
9.4%
362500
9.4%
362500
9.4%
344418
8.9%
340183
8.8%
177650
 
4.6%
106770
 
2.8%
106770
 
2.8%
Other values (40) 830804
21.6%

print_pattern
Text

MISSING 

Distinct: 3781
Distinct (%): 0.9%
Missing: 462777
Missing (%): 52.8%
Memory size: 6.7 MiB

Length

Max length: 39
Median length: 33
Mean length: 4.5031735
Min length: 1

Characters and Unicode

Total characters: 1861711
Distinct characters: 76
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1670
Unique (%): 0.4%

Sample

1st row: (value missing)
2nd row: (value missing)
3rd row: (value missing)
4th row: 스트라이프/별
5th row: (value missing)

Value | Count | Frequency (%)
(value missing) | 133710 | 32.3%
체크 | 27097 | 6.6%
스트라이프 | 16067 | 3.9%
레터링 | 14471 | 3.5%
스트라이프/스트라이프 | 14431 | 3.5%
레터링/레터링 | 13918 | 3.4%
별/체크 | 13320 | 3.2%
체크/체크 | 13283 | 3.2%
플로럴 | 12635 | 3.1%
플로럴/플로럴 | 12518 | 3.0%
Other values (3692) | 141972 | 34.3%
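
Several of the most frequent print_pattern entries repeat the same tag on both sides of the `/` separator, for example 스트라이프/스트라이프 and 체크/체크. A minimal sketch of one possible clean-up follows, assuming `/` is purely a tag separator. The function name `normalize_tags` is hypothetical and is not part of the profiler or of the dataset's own pipeline.

```python
def normalize_tags(cell, sep="/"):
    """Split a multi-valued cell, drop empty and repeated tags, keep first-seen order."""
    seen = []
    for tag in cell.split(sep):
        tag = tag.strip()
        if tag and tag not in seen:
            seen.append(tag)
    return sep.join(seen)


for raw in ("스트라이프/스트라이프", "체크/체크", "별/체크"):
    print(raw, "->", normalize_tags(raw))
```

After such a pass, 체크 and 체크/체크 would be counted as the same value, which would lower the distinct count for this column.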

Most occurring characters

ValueCountFrequency (%)
/ 284193
15.3%
211230
 
11.3%
107532
 
5.8%
105410
 
5.7%
104417
 
5.6%
98857
 
5.3%
84136
 
4.5%
82272
 
4.4%
82077
 
4.4%
79900
 
4.3%
Other values (66) 621687
33.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1577518
84.7%
Other Punctuation 284193
 
15.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
211230
 
13.4%
107532
 
6.8%
105410
 
6.7%
104417
 
6.6%
98857
 
6.3%
84136
 
5.3%
82272
 
5.2%
82077
 
5.2%
79900
 
5.1%
79857
 
5.1%
Other values (65) 541830
34.3%
Other Punctuation
ValueCountFrequency (%)
/ 284193
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1577518
84.7%
Common 284193
 
15.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
211230
 
13.4%
107532
 
6.8%
105410
 
6.7%
104417
 
6.6%
98857
 
6.3%
84136
 
5.3%
82272
 
5.2%
82077
 
5.2%
79900
 
5.1%
79857
 
5.1%
Other values (65) 541830
34.3%
Common
ValueCountFrequency (%)
/ 284193
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1577518
84.7%
ASCII 284193
 
15.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 284193
100.0%
Hangul
ValueCountFrequency (%)
211230
 
13.4%
107532
 
6.8%
105410
 
6.7%
104417
 
6.6%
98857
 
6.3%
84136
 
5.3%
82272
 
5.2%
82077
 
5.2%
79900
 
5.1%
79857
 
5.1%
Other values (65) 541830
34.3%

characteristic
Text

MISSING 

Distinct: 33989
Distinct (%): 6.1%
Missing: 319976
Missing (%): 36.5%
Memory size: 6.7 MiB

Length

Max length: 59
Median length: 51
Mean length: 7.8989039
Min length: 3

Characters and Unicode

Total characters: 4393552
Distinct characters: 182
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20690
Unique (%): 3.7%

Sample

1st row: 탄탄한/하이퀄리티
2nd row: 산뜻한
3rd row: 편안한
4th row: 텐션감/심플한
5th row: 부드러운/시원한

Value | Count | Frequency (%)
부드러운 | 25449 | 4.6%
깔끔한 | 23116 | 4.2%
여성스러운 | 19381 | 3.5%
고급스러운 | 16248 | 2.9%
유니크한 | 15272 | 2.7%
편안한 | 15231 | 2.7%
탄탄한 | 12829 | 2.3%
여리여리한 | 11908 | 2.1%
심플한 | 10507 | 1.9%
가벼운 | 10355 | 1.9%
Other values (33979) | 395927 | 71.2%

Most occurring characters

ValueCountFrequency (%)
538275
 
12.3%
/ 468064
 
10.7%
303510
 
6.9%
262607
 
6.0%
185407
 
4.2%
180074
 
4.1%
148791
 
3.4%
112370
 
2.6%
93492
 
2.1%
91831
 
2.1%
Other values (172) 2009131
45.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3924556
89.3%
Other Punctuation 468064
 
10.7%
Decimal Number 912
 
< 0.1%
Uppercase Letter 20
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
538275
 
13.7%
303510
 
7.7%
262607
 
6.7%
185407
 
4.7%
180074
 
4.6%
148791
 
3.8%
112370
 
2.9%
93492
 
2.4%
91831
 
2.3%
88251
 
2.2%
Other values (168) 1919948
48.9%
Uppercase Letter
ValueCountFrequency (%)
K 10
50.0%
C 10
50.0%
Other Punctuation
ValueCountFrequency (%)
/ 468064
100.0%
Decimal Number
ValueCountFrequency (%)
9 912
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3924556
89.3%
Common 468976
 
10.7%
Latin 20
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
538275
 
13.7%
303510
 
7.7%
262607
 
6.7%
185407
 
4.7%
180074
 
4.6%
148791
 
3.8%
112370
 
2.9%
93492
 
2.4%
91831
 
2.3%
88251
 
2.2%
Other values (168) 1919948
48.9%
Common
ValueCountFrequency (%)
/ 468064
99.8%
9 912
 
0.2%
Latin
ValueCountFrequency (%)
K 10
50.0%
C 10
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3924556
89.3%
ASCII 468996
 
10.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
538275
 
13.7%
303510
 
7.7%
262607
 
6.7%
185407
 
4.7%
180074
 
4.6%
148791
 
3.8%
112370
 
2.9%
93492
 
2.4%
91831
 
2.3%
88251
 
2.2%
Other values (168) 1919948
48.9%
ASCII
ValueCountFrequency (%)
/ 468064
99.8%
9 912
 
0.2%
K 10
 
< 0.1%
C 10
 
< 0.1%

detail
Text

MISSING 

Distinct: 44959
Distinct (%): 7.4%
Missing: 270666
Missing (%): 30.9%
Memory size: 6.7 MiB

Length

Max length: 70
Median length: 66
Mean length: 6.2933515
Min length: 1

Characters and Unicode

Total characters: 3810832
Distinct characters: 124
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27934
Unique (%): 4.6%

Sample

1st row: 맥시/워싱
2nd row: 핀턱/스트링/플리츠
3rd row: 슬릿/맥시/셔링/스트링/나시/랩
4th row: 라운드넥/스트레이트핏
5th row: 크롭/히든버튼/플리츠

Value | Count | Frequency (%)
크롭 | 29755 | 4.9%
루즈핏 | 28182 | 4.7%
밴딩 | 26985 | 4.5%
스트링 | 15799 | 2.6%
워싱 | 15361 | 2.5%
라운드넥 | 10784 | 1.8%
셔링 | 7749 | 1.3%
나시 | 7749 | 1.3%
와이드 | 7167 | 1.2%
프릴 | 6682 | 1.1%
Other values (44949) | 449320 | 74.2%

Most occurring characters

ValueCountFrequency (%)
/ 620862
 
16.3%
190397
 
5.0%
138428
 
3.6%
133942
 
3.5%
126989
 
3.3%
126989
 
3.3%
123073
 
3.2%
116776
 
3.1%
108777
 
2.9%
108016
 
2.8%
Other values (114) 2016583
52.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3117102
81.8%
Other Punctuation 620862
 
16.3%
Uppercase Letter 39231
 
1.0%
Decimal Number 33637
 
0.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
190397
 
6.1%
138428
 
4.4%
133942
 
4.3%
126989
 
4.1%
126989
 
4.1%
123073
 
3.9%
116776
 
3.7%
108777
 
3.5%
108016
 
3.5%
95513
 
3.1%
Other values (103) 1848202
59.3%
Uppercase Letter
ValueCountFrequency (%)
H 12329
31.4%
L 8835
22.5%
X 8835
22.5%
V 6533
16.7%
U 2699
 
6.9%
Decimal Number
ValueCountFrequency (%)
5 9972
29.6%
2 8835
26.3%
9 6466
19.2%
7 5962
17.7%
8 2402
 
7.1%
Other Punctuation
ValueCountFrequency (%)
/ 620862
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3117102
81.8%
Common 654499
 
17.2%
Latin 39231
 
1.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
190397
 
6.1%
138428
 
4.4%
133942
 
4.3%
126989
 
4.1%
126989
 
4.1%
123073
 
3.9%
116776
 
3.7%
108777
 
3.5%
108016
 
3.5%
95513
 
3.1%
Other values (103) 1848202
59.3%
Common
ValueCountFrequency (%)
/ 620862
94.9%
5 9972
 
1.5%
2 8835
 
1.3%
9 6466
 
1.0%
7 5962
 
0.9%
8 2402
 
0.4%
Latin
ValueCountFrequency (%)
H 12329
31.4%
L 8835
22.5%
X 8835
22.5%
V 6533
16.7%
U 2699
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3117102
81.8%
ASCII 693730
 
18.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 620862
89.5%
H 12329
 
1.8%
5 9972
 
1.4%
L 8835
 
1.3%
X 8835
 
1.3%
2 8835
 
1.3%
V 6533
 
0.9%
9 6466
 
0.9%
7 5962
 
0.9%
U 2699
 
0.4%
Hangul
ValueCountFrequency (%)
190397
 
6.1%
138428
 
4.4%
133942
 
4.3%
126989
 
4.1%
126989
 
4.1%
123073
 
3.9%
116776
 
3.7%
108777
 
3.5%
108016
 
3.5%
95513
 
3.1%
Other values (103) 1848202
59.3%

age
Text

MISSING 

Distinct: 55
Distinct (%): 1.5%
Missing: 872423
Missing (%): 99.6%
Memory size: 6.7 MiB

Length

Max length: 19
Median length: 3
Mean length: 5.8342161
Min length: 3

Characters and Unicode

Total characters: 22030
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 0.3%

Sample

1st row: 40대/50대
2nd row: 20대
3rd row: 20대
4th row: 20대
5th row: 40대/50대

Value | Count | Frequency (%)
20대/30대 | 645 | 17.1%
직장인 | 629 | 16.7%
30대 | 537 | 14.2%
40대 | 341 | 9.0%
40대/30대/10대/20대/50대 | 331 | 8.8%
20대 | 249 | 6.6%
1020 | 195 | 5.2%
50대 | 134 | 3.5%
2030 | 91 | 2.4%
대학생 | 79 | 2.1%
Other values (45) | 545 | 14.4%

Most occurring characters

ValueCountFrequency (%)
0 5961
27.1%
5158
23.4%
/ 2560
11.6%
3 1916
 
8.7%
2 1802
 
8.2%
4 1007
 
4.6%
655
 
3.0%
655
 
3.0%
655
 
3.0%
1 625
 
2.8%
Other values (9) 1036
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11922
54.1%
Other Letter 7548
34.3%
Other Punctuation 2560
 
11.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5158
68.3%
655
 
8.7%
655
 
8.7%
655
 
8.7%
173
 
2.3%
149
 
2.0%
37
 
0.5%
24
 
0.3%
24
 
0.3%
9
 
0.1%
Other values (2) 9
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 5961
50.0%
3 1916
 
16.1%
2 1802
 
15.1%
4 1007
 
8.4%
1 625
 
5.2%
5 611
 
5.1%
Other Punctuation
ValueCountFrequency (%)
/ 2560
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14482
65.7%
Hangul 7548
34.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5158
68.3%
655
 
8.7%
655
 
8.7%
655
 
8.7%
173
 
2.3%
149
 
2.0%
37
 
0.5%
24
 
0.3%
24
 
0.3%
9
 
0.1%
Other values (2) 9
 
0.1%
Common
ValueCountFrequency (%)
0 5961
41.2%
/ 2560
17.7%
3 1916
 
13.2%
2 1802
 
12.4%
4 1007
 
7.0%
1 625
 
4.3%
5 611
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14482
65.7%
Hangul 7548
34.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5961
41.2%
/ 2560
17.7%
3 1916
 
13.2%
2 1802
 
12.4%
4 1007
 
7.0%
1 625
 
4.3%
5 611
 
4.2%
Hangul
ValueCountFrequency (%)
5158
68.3%
655
 
8.7%
655
 
8.7%
655
 
8.7%
173
 
2.3%
149
 
2.0%
37
 
0.5%
24
 
0.3%
24
 
0.3%
9
 
0.1%
Other values (2) 9
 
0.1%

weight
Categorical

IMBALANCE  MISSING 

Distinct: 13
Distinct (%): < 0.1%
Missing: 832911
Missing (%): 95.1%
Memory size: 6.7 MiB
가벼운: 40125
경량: 1397
무거운: 640
경량/가벼운: 489
묵직한: 315
Other values (8): 322

Length

Max length: 11
Median length: 3
Mean length: 3.031764
Min length: 2

Characters and Unicode

Total characters: 131239
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 가벼운
2nd row: 가벼운
3rd row: 가벼운
4th row: 가벼운
5th row: 가벼운

Common Values

Value | Count | Frequency (%)
가벼운 | 40125 | 4.6%
경량 | 1397 | 0.2%
무거운 | 640 | 0.1%
경량/가벼운 | 489 | 0.1%
묵직한 | 315 | < 0.1%
가벼운/무거운 | 224 | < 0.1%
경량/무거운 | 41 | < 0.1%
가벼운/묵직한 | 36 | < 0.1%
경량/가벼운/무거운 | 15 | < 0.1%
경량/무거운/묵직한 | 2 | < 0.1%
Other values (3) | 4 | < 0.1%
(Missing) | 832911 | 95.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
가벼운 | 40125 | 92.7%
경량 | 1397 | 3.2%
무거운 | 640 | 1.5%
경량/가벼운 | 489 | 1.1%
묵직한 | 315 | 0.7%
가벼운/무거운 | 224 | 0.5%
경량/무거운 | 41 | 0.1%
가벼운/묵직한 | 36 | 0.1%
경량/가벼운/무거운 | 15 | < 0.1%
경량/무거운/묵직한 | 2 | < 0.1%
Other values (3) | 4 | < 0.1%
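
The 85.4% imbalance alert for weight is consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below is an illustration under that assumption and is not taken from the profiler's source. The function name `imbalance_score` is hypothetical, and the last three counts (2, 1, 1) are inferred from "Other values (3): 4" together with "Unique: 2" rather than read directly from the report.

```python
import math


def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy for this number of classes)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # assumption: a single class is scored as 0 here
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))


# Non-missing value counts for weight; the final 2, 1, 1 are inferred (see above).
weight_counts = [40125, 1397, 640, 489, 315, 224, 41, 36, 15, 2, 2, 1, 1]

score = imbalance_score(weight_counts)
print(f"{score:.1%}")  # close to the 85.4% in the alert
```

A score of 0 means all classes are equally frequent, and a score near 1 means a single class dominates, as 가벼운 does here with 92.7% of the non-missing rows.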

Most occurring characters

ValueCountFrequency (%)
41816
31.9%
40891
31.2%
40891
31.2%
1945
 
1.5%
1945
 
1.5%
925
 
0.7%
925
 
0.7%
/ 830
 
0.6%
357
 
0.3%
357
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 130409
99.4%
Other Punctuation 830
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
41816
32.1%
40891
31.4%
40891
31.4%
1945
 
1.5%
1945
 
1.5%
925
 
0.7%
925
 
0.7%
357
 
0.3%
357
 
0.3%
357
 
0.3%
Other Punctuation
ValueCountFrequency (%)
/ 830
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 130409
99.4%
Common 830
 
0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
41816
32.1%
40891
31.4%
40891
31.4%
1945
 
1.5%
1945
 
1.5%
925
 
0.7%
925
 
0.7%
357
 
0.3%
357
 
0.3%
357
 
0.3%
Common
ValueCountFrequency (%)
/ 830
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 130409
99.4%
ASCII 830
 
0.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
41816
32.1%
40891
31.4%
40891
31.4%
1945
 
1.5%
1945
 
1.5%
925
 
0.7%
925
 
0.7%
357
 
0.3%
357
 
0.3%
357
 
0.3%
ASCII
ValueCountFrequency (%)
/ 830
100.0%

season
Categorical

IMBALANCE  MISSING 

Distinct: 16
Distinct (%): 1.8%
Missing: 875313
Missing (%): 99.9%
Memory size: 6.7 MiB
봄옷: 391
봄나들이: 338
여름룩: 86
휴가철룩: 18
봄여름신상: 17
Other values (11): 36

Length

Max length: 9
Median length: 8
Mean length: 3.1004515
Min length: 2

Characters and Unicode

Total characters: 2747
Distinct characters: 28
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 0.6%

Sample

1st row: 봄나들이
2nd row: 봄나들이
3rd row: 봄나들이
4th row: 봄나들이
5th row: 봄나들이

Common Values

Value | Count | Frequency (%)
봄옷 | 391 | < 0.1%
봄나들이 | 338 | < 0.1%
여름룩 | 86 | < 0.1%
휴가철룩 | 18 | < 0.1%
봄여름신상 | 17 | < 0.1%
겨울코디룩 | 13 | < 0.1%
봄신상룩 | 8 | < 0.1%
여자겨울데일리룩 | 3 | < 0.1%
봄옷코디/봄옷 | 3 | < 0.1%
가을원피스룩 | 2 | < 0.1%
Other values (6) | 7 | < 0.1%
(Missing) | 875313 | 99.9%

Length

Histogram of lengths of the category
Value               Count     Frequency (%)
봄옷                391       44.1%
봄나들이            338       38.1%
여름룩              86        9.7%
휴가철룩            18        2.0%
봄여름신상          17        1.9%
겨울코디룩          13        1.5%
봄신상룩            8         0.9%
여자겨울데일리룩    3         0.3%
봄옷코디/봄옷       3         0.3%
가을원피스룩        2         0.2%
Other values (6)    7         0.8%

Most occurring characters

ValueCountFrequency (%)
765
27.8%
400
14.6%
340
12.4%
340
12.4%
340
12.4%
132
 
4.8%
106
 
3.9%
103
 
3.7%
26
 
0.9%
26
 
0.9%
Other values (18) 169
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2742
99.8%
Other Punctuation 5
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
765
27.9%
400
14.6%
340
12.4%
340
12.4%
340
12.4%
132
 
4.8%
106
 
3.9%
103
 
3.8%
26
 
0.9%
26
 
0.9%
Other values (17) 164
 
6.0%
Other Punctuation
ValueCountFrequency (%)
/ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2742
99.8%
Common 5
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
765
27.9%
400
14.6%
340
12.4%
340
12.4%
340
12.4%
132
 
4.8%
106
 
3.9%
103
 
3.8%
26
 
0.9%
26
 
0.9%
Other values (17) 164
 
6.0%
Common
ValueCountFrequency (%)
/ 5
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2742
99.8%
ASCII 5
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
765
27.9%
400
14.6%
340
12.4%
340
12.4%
340
12.4%
132
 
4.8%
106
 
3.9%
103
 
3.8%
26
 
0.9%
26
 
0.9%
Other values (17) 164
 
6.0%
ASCII
ValueCountFrequency (%)
/ 5
100.0%

sensibility
Text

MISSING 

Distinct150
Distinct (%)0.5%
Missing844379
Missing (%)96.4%
Memory size6.7 MiB

Length

Max length16
Median length12
Mean length3.3600566
Min length2

Characters and Unicode

Total characters106917
Distinct characters123
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique49 ?
Unique (%)0.2%

Sample

1st row감각있는
2nd row청순
3rd row청순
4th row사랑스런
5th row독특한
Value                 Count    Frequency (%)
청순                  7632     24.0%
독특한                7023     22.1%
러블리티              5823     18.3%
사랑스런              2078     6.5%
무드있는              1822     5.7%
글래머러스한          1383     4.3%
귀염                  1156     3.6%
키치한                1051     3.3%
패미닌                565      1.8%
통통튀는              530      1.7%
Other values (140)    2757     8.7%

Most occurring characters

ValueCountFrequency (%)
10801
 
10.1%
7965
 
7.4%
7964
 
7.4%
7318
 
6.8%
7185
 
6.7%
7185
 
6.7%
6098
 
5.7%
5849
 
5.5%
5849
 
5.5%
3756
 
3.5%
Other values (113) 36947
34.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 106339
99.5%
Other Punctuation 578
 
0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10801
 
10.2%
7965
 
7.5%
7964
 
7.5%
7318
 
6.9%
7185
 
6.8%
7185
 
6.8%
6098
 
5.7%
5849
 
5.5%
5849
 
5.5%
3756
 
3.5%
Other values (112) 36369
34.2%
Other Punctuation
ValueCountFrequency (%)
/ 578
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 106339
99.5%
Common 578
 
0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10801
 
10.2%
7965
 
7.5%
7964
 
7.5%
7318
 
6.9%
7185
 
6.8%
7185
 
6.8%
6098
 
5.7%
5849
 
5.5%
5849
 
5.5%
3756
 
3.5%
Other values (112) 36369
34.2%
Common
ValueCountFrequency (%)
/ 578
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 106339
99.5%
ASCII 578
 
0.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
10801
 
10.2%
7965
 
7.5%
7964
 
7.5%
7318
 
6.9%
7185
 
6.8%
7185
 
6.8%
6098
 
5.7%
5849
 
5.5%
5849
 
5.5%
3756
 
3.5%
Other values (112) 36369
34.2%
ASCII
ValueCountFrequency (%)
/ 578
100.0%

sale_price
Real number (ℝ)

SKEWED 

Distinct9231
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean44300.126
Minimum10
Maximum1 × 10⁸
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size6.7 MiB

Quantile statistics

Minimum10
5-th percentile13900
Q122900
median32500
Q348400
95-th percentile108000
Maximum1 × 10⁸
Range99999990
Interquartile range (IQR)25500

Descriptive statistics

Standard deviation122560.76
Coefficient of variation (CV)2.7666007
Kurtosis506040.15
Mean44300.126
Median Absolute Deviation (MAD)11600
Skewness628.03849
Sum3.8815726 × 10¹⁰
Variance1.5021139 × 10¹⁰
MonotonicityNot monotonic
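
Several of the figures above follow from one another. The sketch below recomputes the derived statistics from the reported values; the variable names are illustrative, and small differences are expected because the report rounds its inputs.

```python
# Figures copied from the sale_price summary above.
n = 876199                       # number of observations
mean = 44300.126
std = 122560.76
q1, q3 = 22900, 48400
minimum, maximum = 10, 100_000_000

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum
variance = std ** 2
total = mean * n                 # sum of all sale_price values

print(f"CV={cv:.7f}  IQR={iqr}  range={value_range}")
print(f"variance={variance:.7e}  sum={total:.7e}")
```

The standard deviation is almost three times the mean while the IQR is only 25500, which is consistent with a few extreme prices (up to 1 × 10⁸) driving the very high skewness.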
Histogram with fixed size bins (bins=50)
Most frequent values

Value                  Count     Frequency (%)
19800                  12297     1.4%
19900                  11190     1.3%
29900                  10972     1.3%
29800                  10328     1.2%
32000                  9993      1.1%
18900                  9627      1.1%
39000                  9149      1.0%
29000                  8855      1.0%
34000                  8831      1.0%
14900                  8802      1.0%
Other values (9221)    776155    88.6%

Smallest 10 values

Value    Count    Frequency (%)
10       54       < 0.1%
100      8        < 0.1%
1000     36       < 0.1%
1300     1        < 0.1%
1500     2        < 0.1%
1800     1        < 0.1%
1900     3        < 0.1%
1980     1        < 0.1%
2000     2        < 0.1%
2050     1        < 0.1%

Largest 10 values

Value        Count    Frequency (%)
100000000    1        < 0.1%
19500000     1        < 0.1%
15950000     1        < 0.1%
10000000     1        < 0.1%
4756000      1        < 0.1%
4640000      1        < 0.1%
3980000      2        < 0.1%
3850000      1        < 0.1%
3600000      1        < 0.1%
3323200      2        < 0.1%

recent_sale_count
Real number (ℝ)

SKEWED  ZEROS 

Distinct140
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.086101445
Minimum0
Maximum841
Zeros854339
Zeros (%)97.5%
Negative0
Negative (%)0.0%
Memory size6.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum841
Range841
Interquartile range (IQR)0

Descriptive statistics

Standard deviation2.3665654
Coefficient of variation (CV)27.48578
Kurtosis39380.136
Mean0.086101445
Median Absolute Deviation (MAD)0
Skewness159.69803
Sum75442
Variance5.6006317
MonotonicityNot monotonic
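
The zero-heavy shape of this column can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers reported in this section; the variable names are illustrative.

```python
# Figures copied from the recent_sale_count summary above.
n = 876199          # number of observations
zeros = 854339      # rows with recent_sale_count == 0
total = 75442       # reported Sum
std = 2.3665654

mean = total / n                        # should match the reported Mean
zero_share = zeros / n                  # share of rows with no recent sales
cv = std / mean                         # coefficient of variation
nonzero_mean = total / (n - zeros)      # average over rows with at least one sale

print(f"mean={mean:.9f}  zeros={zero_share:.1%}  CV={cv:.5f}")
print(f"mean over non-zero rows={nonzero_mean:.2f}")
```

Because more than 95% of the rows are zero, every quantile up to the 95th percentile is 0 and the IQR and MAD collapse to 0, so the mean and standard deviation are determined entirely by the small non-zero tail.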
Histogram with fixed size bins (bins=50)
Most frequent values

Value                 Count     Frequency (%)
0                     854339    97.5%
1                     13311     1.5%
2                     3585      0.4%
3                     1561      0.2%
4                     822       0.1%
5                     518       0.1%
6                     375       < 0.1%
7                     238       < 0.1%
8                     191       < 0.1%
9                     158       < 0.1%
Other values (130)    1101      0.1%

Smallest 10 values

Value    Count     Frequency (%)
0        854339    97.5%
1        13311     1.5%
2        3585      0.4%
3        1561      0.2%
4        822       0.1%
5        518       0.1%
6        375       < 0.1%
7        238       < 0.1%
8        191       < 0.1%
9        158       < 0.1%

Largest 10 values

Value    Count    Frequency (%)
841      1        < 0.1%
664      1        < 0.1%
567      1        < 0.1%
523      2        < 0.1%
429      1        < 0.1%
412      1        < 0.1%
369      1        < 0.1%
299      1        < 0.1%
288      1        < 0.1%
247      1        < 0.1%

Interactions

[Four interaction plots; images not recoverable]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
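
Since the two nullity figures are not reproduced here, the sketch below shows one way the same information could be derived in plain Python. It assumes missing cells are represented as `None`. The function names and the toy rows are hypothetical and are not taken from the dataset or from the profiling tool.

```python
def nullity_by_column(rows):
    """Return {column: count of missing values}; None marks a missing cell."""
    missing = {}
    for row in rows:
        for column, value in row.items():
            missing[column] = missing.get(column, 0) + (1 if value is None else 0)
    return missing


def nullity_matrix(rows, columns):
    """One text line per row: '#' where a value is present, '.' where it is missing."""
    return ["".join("." if row.get(c) is None else "#" for c in columns) for row in rows]


# Hypothetical toy rows, not taken from the dataset.
toy_rows = [
    {"item": None, "color": "black", "season": None},
    {"item": "dress", "color": None, "season": None},
    {"item": "skirt", "color": "navy", "season": "spring"},
]
per_column = nullity_by_column(toy_rows)
matrix = nullity_matrix(toy_rows, ["item", "color", "season"])
print(per_column)
print("\n".join(matrix))
```

The per-column counts correspond to the bar-style view, and the text matrix is a small stand-in for the nullity matrix, where runs of `.` in one column show where values are absent.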

Sample

category_large_desc  category_middle_desc  category_small_desc  item  color  use  material  print_pattern  characteristic  detail  age  weight  season  sensibility  sale_price  recent_sale_count
0여성복원피스롱원피스NaN네온/베이지NaN폴리에스터/코튼탄탄한/하이퀄리티맥시/워싱NaNNaNNaN감각있는479000
1여성복원피스롱원피스플리츠원피스/루즈핏원피스/여름원피스/스트링원피스/반팔원피스/롱원피스베이지/네이비/블랙/블루/그린NaN폴리에스터/나일론NaN산뜻한핀턱/스트링/플리츠NaNNaNNaNNaN2590027
2여성복원피스롱원피스NaN블랙/그레이NaN실크/스판/레이온편안한NaNNaNNaNNaNNaN1691036
3여성복원피스롱원피스NaN아이보리/브라운/베이지/네이비/그레이/블랙NaN폴리에스터NaN슬릿/맥시/셔링/스트링/나시/랩NaNNaNNaNNaN3097056
4여성복원피스롱원피스NaN블랙/그레이데일리룩코튼스트라이프/별텐션감/심플한라운드넥/스트레이트핏NaNNaNNaNNaN4560014
5여성복원피스롱원피스롱셔츠/셔츠원피스/배색원피스/플리츠원피스/롱셔츠원피스블랙/화이트/베이지/네이비55사이즈폴리에스터NaN크롭/히든버튼/플리츠NaNNaNNaNNaN45000110
6여성복원피스롱원피스프릴원피스/캉캉원피스/러플원피스/레이스원피스블랙/화이트/베이지55사이즈실크/레이온/코튼부드러운/시원한캉캉/밴딩/크롭/프릴NaNNaNNaNNaN4980018
7여성복원피스롱원피스여름원피스/러플원피스/나들이원피스/프릴원피스/스트라이프원피스블랙/화이트/베이지55사이즈/날씬해보이는폴리에스터스트라이프/별/스트라이프날씬해보이는스트링/밴딩/크롭/프릴NaNNaNNaNNaN4480037
8여성복원피스롱원피스데일리원피스/스퀘어넥원피스블랙/화이트/핑크55사이즈/날씬해보이는폴리에스터/스판고급스러운/날씬해보이는크롭NaNNaNNaNNaN4880024
9여성복원피스롱원피스여름원피스/휴양지원피스/단추원피스/바캉스룩원피스블랙/화이트55사이즈실크/레이온/코튼기하학/별부드러운크롭/히든버튼NaNNaNNaNNaN4780014
category_large_desc  category_middle_desc  category_small_desc  item  color  use  material  print_pattern  characteristic  detail  age  weight  season  sensibility  sale_price  recent_sale_count
876189여성복스커트롱스커트롱스커트/플리츠스커트/벨벳스커트/밴딩스커트/A라인스커트네이비/베이지/그레이/핑크/블랙/그린체형커버벨벳NaN여성스러운플리츠/밴딩/워싱NaNNaNNaNNaN150000
876190여성복스커트롱스커트롱스커트/밴딩스커트/스판스커트/슬릿스커트/울스커트/H라인스커트블랙/라이트블루/브라운/베이지NaN폴리에스터/울도톰한/편안한/깔끔한/부드러운밴딩/H라인NaNNaNNaNNaN184000
876191여성복스커트롱스커트겨울스커트/롱스커트/머메이드스커트/울스커트블랙/브라운/베이지NaN폴리에스터/울여성스러운/여리여리한/편안한핀턱/밴딩/스트링/머메이드/보트넥NaNNaNNaNNaN209000
876192여성복스커트롱스커트NaN차콜/블랙55사이즈폴리에스터NaN여성스러운/편안한밴딩/크롭NaNNaNNaNNaN146500
876193여성복스커트롱스커트롱스커트/밴딩스커트/겨울스커트/베이식스커트/울스커트/H라인스커트/기모스커트블랙/그린데일리룩별/플로럴남녀공용/텐션감슬릿/밴딩/피그먼트워싱/스트링/루즈핏/부츠컷/워싱NaNNaNNaNNaN125000
876194여성복스커트롱스커트NaN차콜/블랙/브라운NaN폴리에스터/아크릴NaNNaN밴딩/H라인NaNNaNNaNNaN348000
876195여성복스커트롱스커트NaN블랙/베이지NaN폴리에스터저렴한밴딩NaNNaNNaNNaN198000
876196여성복스커트롱스커트롱스커트/밴딩스커트/겨울스커트/슬릿스커트/밴딩롱스커트/울스커트블랙/브라운/핑크마실룩폴리에스터/스판NaN텐션감/편안한밴딩NaNNaNNaNNaN149000
876197여성복스커트롱스커트롱스커트/프린트스커트/플리츠롱스커트블랙/옐로우NaN폴리에스터NaN유니크한/부드러운밴딩NaNNaNNaNNaN195000
876198여성복스커트롱스커트롱스커트/밴딩니트스커트/니트스커트/레이스스커트/하객룩스커트아이보리/블랙/브라운/베이지55사이즈폴리에스터/스판NaN여성스러운밴딩/시스루NaNNaNNaNNaN225000

Duplicate rows

Most frequently occurring

category_large_desc  category_middle_desc  category_small_desc  item  color  use  material  print_pattern  characteristic  detail  age  weight  season  sensibility  sale_price  recent_sale_count  # duplicates
12522여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN198000179
12600여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN298000140
12561여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN248000105
12509여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN18000089
12556여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN24000086
12487여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN15800081
12440여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN9900079
12678여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN39800078
12595여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN29000077
12632여성복상의티셔츠NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN33900077
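
A table like the one above can be approximated by grouping identical rows and keeping the groups that occur more than once. The sketch below is a minimal standard-library version. The function name and toy rows are hypothetical, missing cells are assumed to be `None` (float NaN values would not compare equal to each other), and the profiling tool may count duplicates differently.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, occurrences) pairs for rows seen more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    return [(row, count) for row, count in counts.most_common() if count > 1]


# Hypothetical toy rows: (middle category, small category, color, sale_price).
toy_rows = [
    ("tops", "t-shirt", None, 19800),
    ("tops", "t-shirt", None, 29800),
    ("tops", "t-shirt", None, 19800),
    ("skirt", "long skirt", "black", 15000),
    ("tops", "t-shirt", None, 19800),
    ("tops", "t-shirt", None, 29800),
]
groups = duplicate_groups(toy_rows)
for row, count in groups:
    print(count, row)
```

As in the table above, rows whose attributes are mostly missing collide easily, because only the category and the price distinguish them.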